Predicting plant protein subcellular multi-localization by Chou's PseAAC formulation based multi-label homolog knowledge transfer learning

Suyu Mei

doi:10.1016/j.jtbi.2012.06.028

J Theor Biol. 2012 Oct 7:310:80-7. doi: 10.1016/j.jtbi.2012.06.028. Epub 2012 Jun 27.

Author

Suyu Mei¹

Affiliation

¹ Software College, Shenyang Normal University, Shenyang, China. 061021053@fudan.edu.cn

PMID: 22750634
DOI: 10.1016/j.jtbi.2012.06.028

Abstract

Recent years have witnessed much progress in computational modeling of protein subcellular localization. However, there are far fewer computational models for predicting plant protein subcellular multi-localization. In this paper, we propose a multi-label multi-kernel transfer learning model (MLMK-TLM) for predicting multiple subcellular locations of plant proteins. The method introduces a multi-label confusion matrix and adapts one-against-all multi-class probabilistic outputs to the multi-label learning scenario, on the basis of which we extend our previously published MK-TLM (multi-kernel transfer learning based on Chou's PseAAC formulation for protein submitochondria localization) to plant protein subcellular multi-localization. With proper homolog knowledge transfer, MLMK-TLM is applicable to novel plant proteins in the multi-label learning scenario. Experiments on a plant protein benchmark dataset show that MLMK-TLM outperforms the baseline model. Unlike existing models, MLMK-TLM also reports its misleading tendency, which is important for a comprehensive survey of a model's multi-labeling performance.
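
The abstract does not spell out how one-against-all probabilistic outputs are turned into label sets, or how the multi-label confusion matrix is defined. The following is only a minimal sketch of one plausible reading, not the paper's actual procedure. The function names, the 0.5 threshold, the fall-back to the top-scoring location, and the rule for counting off-diagonal confusions are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def predict_label_set(probs: Dict[str, float], threshold: float = 0.5) -> Set[str]:
    """Turn per-location one-against-all probabilities into a label set.

    Assumption: keep every location whose probability reaches `threshold`;
    if none does, fall back to the single highest-scoring location so that
    each protein receives at least one label.
    """
    chosen = {loc for loc, p in probs.items() if p >= threshold}
    if not chosen:
        chosen = {max(probs, key=lambda loc: probs[loc])}
    return chosen


def multilabel_confusion(
    true_sets: List[Set[str]],
    pred_sets: List[Set[str]],
    labels: List[str],
) -> Dict[str, Dict[str, int]]:
    """Tally a label-by-label confusion table for multi-label predictions.

    Assumption: a correctly predicted location adds 1 to its diagonal cell;
    each missed true location is paired with each spurious predicted
    location, adding 1 to counts[missed][spurious].
    """
    counts = {t: {p: 0 for p in labels} for t in labels}
    for truth, pred in zip(true_sets, pred_sets):
        for loc in truth & pred:
            counts[loc][loc] += 1
        for missed in truth - pred:
            for spurious in pred - truth:
                counts[missed][spurious] += 1
    return counts


# Toy data (made up for illustration, not taken from the benchmark dataset).
labels = ["chloroplast", "mitochondrion", "nucleus"]
true_sets = [{"chloroplast", "mitochondrion"}, {"nucleus"}, {"chloroplast"}]
prob_outputs = [
    {"chloroplast": 0.9, "mitochondrion": 0.6, "nucleus": 0.1},
    {"chloroplast": 0.2, "mitochondrion": 0.7, "nucleus": 0.3},
    {"chloroplast": 0.4, "mitochondrion": 0.3, "nucleus": 0.2},
]
pred_sets = [predict_label_set(p) for p in prob_outputs]
confusion = multilabel_confusion(true_sets, pred_sets, labels)
for true_loc in labels:
    print(true_loc, confusion[true_loc])
```

Under these assumptions, the off-diagonal cells show which locations the classifier tends to assign in place of the true ones, which is one way to read the "misleading tendency" mentioned in the abstract.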

MeSH terms

Artificial Intelligence*
Computational Biology / methods*
Databases, Protein
Plant Proteins / metabolism*
Protein Transport
Sequence Homology, Amino Acid*
Software*
Subcellular Fractions / metabolism

Substances

Plant Proteins